Dataset statistics
| Number of variables | 12 |
|---|---|
| Number of observations | 19681 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 6 |
| Duplicate rows (%) | < 0.1% |
| Total size in memory | 1.8 MiB |
| Average record size in memory | 96.0 B |
Variable types
| NUM | 8 |
|---|---|
| CAT | 4 |
Dataset has 6 (< 0.1%) duplicate rows | Duplicates
user_id has a high cardinality: 19675 distinct values | High cardinality |
devicebrand has a high cardinality: 100 distinct values | High cardinality |
devicebrand is highly correlated with attr_os_str | High correlation |
attr_os_str is highly correlated with devicebrand | High correlation |
avg_revenue is highly skewed (γ1 = 26.47550501) | Skewed |
cnt_call is highly skewed (γ1 = 30.02842689) | Skewed |
cnt_dis is highly skewed (γ1 = 47.58521106) | Skewed |
cnt_add_ons is highly skewed (γ1 = 47.99302987) | Skewed |
user_id is uniformly distributed | Uniform |
cnt_call has 18800 (95.5%) zeros | Zeros |
cnt_dis has 18681 (94.9%) zeros | Zeros |
cnt_mobile has 1472 (7.5%) zeros | Zeros |
cnt_internet has 18713 (95.1%) zeros | Zeros |
cnt_tv has 16062 (81.6%) zeros | Zeros |
cnt_voice has 19260 (97.9%) zeros | Zeros |
cnt_add_ons has 10600 (53.9%) zeros | Zeros |
Reproduction
| Analysis started | 2020-12-13 17:09:14.230245 |
|---|---|
| Analysis finished | 2020-12-13 17:09:39.845533 |
| Duration | 25.62 seconds |
| Software version | pandas-profiling v2.9.0 |
| Download configuration | config.yaml |
user_id
Categorical
| Distinct | 19675 |
|---|---|
| Distinct (%) | > 99.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 153.8 KiB |
| Value | Count | Frequency (%) | |
| 2917446999 | 2 | < 0.1% | |
| 2514811034 | 2 | < 0.1% | |
| 3103546688 | 2 | < 0.1% | |
| TMCZ_9613733 | 2 | < 0.1% | |
| 100487292 | 2 | < 0.1% | |
| TMCZ_6009584899 | 2 | < 0.1% | |
| 955718ff-519a-48ed-8f49-0fd7a76c741b | 1 | < 0.1% | |
| TMCZ_6007561144 | 1 | < 0.1% | |
| 2343743 | 1 | < 0.1% | |
| 3286175011 | 1 | < 0.1% | |
| Other values (19665) | 19665 | 99.9% |
Frequencies of value counts
Unique
| Unique | 19669 |
|---|---|
| Unique (%) | 99.9% |
Histogram of lengths of the category
Length
| Max length | 75 |
|---|---|
| Median length | 10 |
| Mean length | 15.2909405 |
| Min length | 4 |
avg_revenue
| Distinct | 17615 |
|---|---|
| Distinct (%) | 89.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 40.33600217 |
|---|---|
| Minimum | 0 |
| Maximum | 5885.225 |
| Zeros | 10 |
| Zeros (%) | 0.1% |
| Memory size | 153.8 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 2.4689 |
| Q1 | 10.0833 |
| median | 22.53333333 |
| Q3 | 46.6347 |
| 95-th percentile | 120.8426667 |
| Maximum | 5885.225 |
| Range | 5885.225 |
| Interquartile range (IQR) | 36.5514 |
Descriptive statistics
| Standard deviation | 99.50439393 |
|---|---|
| Coefficient of variation (CV) | 2.466887856 |
| Kurtosis | 1072.73635 |
| Mean | 40.33600217 |
| Median Absolute Deviation (MAD) | 15.00273333 |
| Skewness | 26.47550501 |
| Sum | 793852.8588 |
| Variance | 9901.124411 |
| Monotonicity | Not monotonic |
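As a quick consistency check, several of the derived figures in the tables above follow directly from the other reported values. The sketch below recomputes them; the variable names are illustrative only, and the inputs are the rounded values shown in the report, so agreement is expected only to within rounding.

```python
# Values copied from the quantile and descriptive statistics tables above.
n_observations = 19681
mean = 40.33600217
std = 99.50439393
q1, q3 = 10.0833, 46.6347
minimum, maximum = 0.0, 5885.225

cv = std / mean                  # coefficient of variation, reported as 2.466887856
variance = std ** 2              # reported as 9901.124411
iqr = q3 - q1                    # interquartile range, reported as 36.5514
value_range = maximum - minimum  # reported as 5885.225
total = mean * n_observations    # sum of the column, reported as 793852.8588

for name, value in [("CV", cv), ("Variance", variance), ("IQR", iqr),
                    ("Range", value_range), ("Sum", total)]:
    print(f"{name}: {value:.6f}")
```

A mean that is well above the median (40.3 versus 22.5) together with a coefficient of variation above 2 is consistent with the strong right skew the report flags for this variable.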
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) | |
| 9.5 | 120 | 0.6% | |
| 11.4 | 77 | 0.4% | |
| 6.6 | 53 | 0.3% | |
| 5 | 50 | 0.3% | |
| 5.5 | 40 | 0.2% | |
| 10 | 36 | 0.2% | |
| 19 | 31 | 0.2% | |
| 7.8 | 28 | 0.1% | |
| 5.7 | 28 | 0.1% | |
| 3.9 | 26 | 0.1% | |
| Other values (17605) | 19192 | 97.5% |
| Value | Count | Frequency (%) | |
| 0 | 10 | 0.1% | |
| 0.0007333333333 | 1 | < 0.1% | |
| 0.0021 | 1 | < 0.1% | |
| 0.002514285714 | 1 | < 0.1% | |
| 0.0039 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 5885.225 | 1 | < 0.1% | |
| 3912.63625 | 1 | < 0.1% | |
| 3801.801 | 1 | < 0.1% | |
| 3222.741 | 1 | < 0.1% | |
| 2977.519091 | 1 | < 0.1% |
nc
Categorical
| Distinct | 10 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 153.8 KiB |
| Value | Count | Frequency (%) | |
| ro | 8890 | 45.2% | |
| cz | 3040 | 15.4% | |
| hr | 2935 | 14.9% | |
| pl | 1621 | 8.2% | |
| sk | 903 | 4.6% | |
| mk | 878 | 4.5% | |
| hu | 750 | 3.8% | |
| me | 503 | 2.6% | |
| at | 91 | 0.5% | |
| heyah | 70 | 0.4% |
Frequencies of value counts
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Histogram of lengths of the category
Length
| Max length | 5 |
|---|---|
| Median length | 2 |
| Mean length | 2.01067019 |
| Min length | 2 |
attr_os_str
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 153.8 KiB |
| Value | Count | Frequency (%) | |
| ANDROID | 17768 | 90.3% | |
| IOS | 1913 | 9.7% |
Frequencies of value counts
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Histogram of lengths of the category
Length
| Max length | 7 |
|---|---|
| Median length | 7 |
| Mean length | 6.611198618 |
| Min length | 3 |
devicebrand
Categorical
| Distinct | 100 |
|---|---|
| Distinct (%) | 0.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 153.8 KiB |
| Value | Count | Frequency (%) | |
| samsung | 9785 | 49.7% | |
| HUAWEI | 4192 | 21.3% | |
| Apple | 1913 | 9.7% | |
| xiaomi | 665 | 3.4% | |
| Xiaomi | 509 | 2.6% | |
| HONOR | 349 | 1.8% | |
| Redmi | 329 | 1.7% | |
| motorola | 263 | 1.3% | |
| Nokia | 235 | 1.2% | |
| Sony | 221 | 1.1% | |
| Other values (90) | 1220 | 6.2% |
Frequencies of value counts
Unique
| Unique | 32 |
|---|---|
| Unique (%) | 0.2% |
Histogram of lengths of the category
Length
| Max length | 11 |
|---|---|
| Median length | 7 |
| Mean length | 6.306590112 |
| Min length | 2 |
cnt_call
| Distinct | 114 |
|---|---|
| Distinct (%) | 0.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.130328743 |
|---|---|
| Minimum | 0 |
| Maximum | 705 |
| Zeros | 18800 |
| Zeros (%) | 95.5% |
| Memory size | 153.8 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 0 |
| Maximum | 705 |
| Range | 705 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 11.43979625 |
|---|---|
| Coefficient of variation (CV) | 10.12076913 |
| Kurtosis | 1367.630001 |
| Mean | 1.130328743 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 30.02842689 |
| Sum | 22246 |
| Variance | 130.8689384 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) | |
| 0 | 18800 | 95.5% | |
| 3 | 59 | 0.3% | |
| 1 | 59 | 0.3% | |
| 6 | 48 | 0.2% | |
| 4 | 46 | 0.2% | |
| 5 | 43 | 0.2% | |
| 2 | 41 | 0.2% | |
| 8 | 35 | 0.2% | |
| 7 | 29 | 0.1% | |
| 11 | 28 | 0.1% | |
| Other values (104) | 493 | 2.5% |
| Value | Count | Frequency (%) | |
| 0 | 18800 | 95.5% | |
| 1 | 59 | 0.3% | |
| 2 | 41 | 0.2% | |
| 3 | 59 | 0.3% | |
| 4 | 46 | 0.2% |
| Value | Count | Frequency (%) | |
| 705 | 1 | < 0.1% | |
| 579 | 1 | < 0.1% | |
| 471 | 1 | < 0.1% | |
| 364 | 1 | < 0.1% | |
| 343 | 1 | < 0.1% |
cnt_dis
| Distinct | 713 |
|---|---|
| Distinct (%) | 3.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 42.72354047 |
|---|---|
| Minimum | 0 |
| Maximum | 35711 |
| Zeros | 18681 |
| Zeros (%) | 94.9% |
| Memory size | 153.8 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 17 |
| Maximum | 35711 |
| Range | 35711 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 544.0456197 |
|---|---|
| Coefficient of variation (CV) | 12.73409492 |
| Kurtosis | 2849.520163 |
| Mean | 42.72354047 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 47.58521106 |
| Sum | 840842 |
| Variance | 295985.6363 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) | |
| 0 | 18681 | 94.9% | |
| 118 | 6 | < 0.1% | |
| 82 | 5 | < 0.1% | |
| 219 | 5 | < 0.1% | |
| 75 | 5 | < 0.1% | |
| 176 | 4 | < 0.1% | |
| 570 | 4 | < 0.1% | |
| 72 | 4 | < 0.1% | |
| 346 | 4 | < 0.1% | |
| 85 | 4 | < 0.1% | |
| Other values (703) | 959 | 4.9% |
| Value | Count | Frequency (%) | |
| 0 | 18681 | 94.9% | |
| 2 | 3 | < 0.1% | |
| 3 | 1 | < 0.1% | |
| 7 | 1 | < 0.1% | |
| 8 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 35711 | 1 | < 0.1% | |
| 35616 | 1 | < 0.1% | |
| 34153 | 1 | < 0.1% | |
| 22164 | 1 | < 0.1% | |
| 16149 | 1 | < 0.1% |
cnt_mobile
| Distinct | 32 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2.015344749 |
|---|---|
| Minimum | 0 |
| Maximum | 50 |
| Zeros | 1472 |
| Zeros (%) | 7.5% |
| Memory size | 153.8 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 1 |
| median | 1 |
| Q3 | 3 |
| 95-th percentile | 5 |
| Maximum | 50 |
| Range | 50 |
| Interquartile range (IQR) | 2 |
Descriptive statistics
| Standard deviation | 1.802174695 |
|---|---|
| Coefficient of variation (CV) | 0.8942265071 |
| Kurtosis | 64.51463549 |
| Mean | 2.015344749 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 4.742233027 |
| Sum | 39664 |
| Variance | 3.247833632 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=32)
| Value | Count | Frequency (%) | |
| 1 | 8468 | 43.0% | |
| 2 | 4312 | 21.9% | |
| 3 | 2563 | 13.0% | |
| 0 | 1472 | 7.5% | |
| 4 | 1451 | 7.4% | |
| 5 | 726 | 3.7% | |
| 6 | 296 | 1.5% | |
| 7 | 182 | 0.9% | |
| 8 | 68 | 0.3% | |
| 9 | 44 | 0.2% | |
| Other values (22) | 99 | 0.5% |
| Value | Count | Frequency (%) | |
| 0 | 1472 | 7.5% | |
| 1 | 8468 | 43.0% | |
| 2 | 4312 | 21.9% | |
| 3 | 2563 | 13.0% | |
| 4 | 1451 | 7.4% |
| Value | Count | Frequency (%) | |
| 50 | 1 | < 0.1% | |
| 38 | 1 | < 0.1% | |
| 36 | 1 | < 0.1% | |
| 34 | 1 | < 0.1% | |
| 28 | 1 | < 0.1% |
cnt_internet
| Distinct | 7 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.05700929831 |
|---|---|
| Minimum | 0 |
| Maximum | 16 |
| Zeros | 18713 |
| Zeros (%) | 95.1% |
| Memory size | 153.8 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 0 |
| Maximum | 16 |
| Range | 16 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 0.2886782649 |
|---|---|
| Coefficient of variation (CV) | 5.06370493 |
| Kurtosis | 505.2999286 |
| Mean | 0.05700929831 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 13.10083127 |
| Sum | 1122 |
| Variance | 0.08333514061 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=7)
| Value | Count | Frequency (%) | |
| 0 | 18713 | 95.1% | |
| 1 | 851 | 4.3% | |
| 2 | 100 | 0.5% | |
| 3 | 11 | 0.1% | |
| 4 | 3 | < 0.1% | |
| 5 | 2 | < 0.1% | |
| 16 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 0 | 18713 | 95.1% | |
| 1 | 851 | 4.3% | |
| 2 | 100 | 0.5% | |
| 3 | 11 | 0.1% | |
| 4 | 3 | < 0.1% |
| Value | Count | Frequency (%) | |
| 16 | 1 | < 0.1% | |
| 5 | 2 | < 0.1% | |
| 4 | 3 | < 0.1% | |
| 3 | 11 | 0.1% | |
| 2 | 100 | 0.5% |
cnt_tv
| Distinct | 14 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.4159341497 |
|---|---|
| Minimum | 0 |
| Maximum | 14 |
| Zeros | 16062 |
| Zeros (%) | 81.6% |
| Memory size | 153.8 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 3 |
| Maximum | 14 |
| Range | 14 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 1.144782328 |
|---|---|
| Coefficient of variation (CV) | 2.752316271 |
| Kurtosis | 17.71448603 |
| Mean | 0.4159341497 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 3.816086838 |
| Sum | 8186 |
| Variance | 1.310526578 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=14)
| Value | Count | Frequency (%) | |
| 0 | 16062 | 81.6% | |
| 1 | 1733 | 8.8% | |
| 2 | 757 | 3.8% | |
| 3 | 436 | 2.2% | |
| 4 | 273 | 1.4% | |
| 5 | 197 | 1.0% | |
| 6 | 107 | 0.5% | |
| 7 | 65 | 0.3% | |
| 8 | 26 | 0.1% | |
| 9 | 11 | 0.1% | |
| Other values (4) | 14 | 0.1% |
| Value | Count | Frequency (%) | |
| 0 | 16062 | 81.6% | |
| 1 | 1733 | 8.8% | |
| 2 | 757 | 3.8% | |
| 3 | 436 | 2.2% | |
| 4 | 273 | 1.4% |
| Value | Count | Frequency (%) | |
| 14 | 1 | < 0.1% | |
| 12 | 2 | < 0.1% | |
| 11 | 2 | < 0.1% | |
| 10 | 9 | < 0.1% | |
| 9 | 11 | 0.1% |
cnt_voice
| Distinct | 5 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.02342360652 |
|---|---|
| Minimum | 0 |
| Maximum | 9 |
| Zeros | 19260 |
| Zeros (%) | 97.9% |
| Memory size | 153.8 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 0 |
| Maximum | 9 |
| Range | 9 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 0.1731759933 |
|---|---|
| Coefficient of variation (CV) | 7.393224999 |
| Kurtosis | 417.0160229 |
| Mean | 0.02342360652 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 13.26119923 |
| Sum | 461 |
| Variance | 0.02998992466 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=5)
| Value | Count | Frequency (%) | |
| 0 | 19260 | 97.9% | |
| 1 | 390 | 2.0% | |
| 2 | 28 | 0.1% | |
| 3 | 2 | < 0.1% | |
| 9 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 0 | 19260 | 97.9% | |
| 1 | 390 | 2.0% | |
| 2 | 28 | 0.1% | |
| 3 | 2 | < 0.1% | |
| 9 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 9 | 1 | < 0.1% | |
| 3 | 2 | < 0.1% | |
| 2 | 28 | 0.1% | |
| 1 | 390 | 2.0% | |
| 0 | 19260 | 97.9% |
cnt_add_ons
| Distinct | 192 |
|---|---|
| Distinct (%) | 1.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 6.349778975 |
|---|---|
| Minimum | 0 |
| Maximum | 2670 |
| Zeros | 10600 |
| Zeros (%) | 53.9% |
| Memory size | 153.8 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 5 |
| 95-th percentile | 30 |
| Maximum | 2670 |
| Range | 2670 |
| Interquartile range (IQR) | 5 |
Descriptive statistics
| Standard deviation | 28.46838937 |
|---|---|
| Coefficient of variation (CV) | 4.483366977 |
| Kurtosis | 4030.342506 |
| Mean | 6.349778975 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 47.99302987 |
| Sum | 124970 |
| Variance | 810.4491932 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) | |
| 0 | 10600 | 53.9% | |
| 1 | 1627 | 8.3% | |
| 2 | 1070 | 5.4% | |
| 3 | 779 | 4.0% | |
| 4 | 641 | 3.3% | |
| 5 | 527 | 2.7% | |
| 6 | 439 | 2.2% | |
| 7 | 346 | 1.8% | |
| 8 | 313 | 1.6% | |
| 9 | 275 | 1.4% | |
| Other values (182) | 3064 | 15.6% |
| Value | Count | Frequency (%) | |
| 0 | 10600 | 53.9% | |
| 1 | 1627 | 8.3% | |
| 2 | 1070 | 5.4% | |
| 3 | 779 | 4.0% | |
| 4 | 641 | 3.3% |
| Value | Count | Frequency (%) | |
| 2670 | 1 | < 0.1% | |
| 961 | 1 | < 0.1% | |
| 761 | 1 | < 0.1% | |
| 684 | 1 | < 0.1% | |
| 599 | 1 | < 0.1% |
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
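The calculation described above can be sketched in a few lines of plain Python. This is an illustrative implementation, not the code the report itself runs; the function name `pearson_r` and the sample data are assumptions made for the example.

```python
import math

def pearson_r(xs, ys):
    """Covariance of X and Y divided by the product of their standard deviations."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # The 1/n factors of the covariance and the standard deviations cancel,
    # so plain sums of products of deviations are sufficient.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)  # raises ZeroDivisionError for a constant input

xs = [1, 2, 3, 4, 5]
ys = [2, 1, 4, 3, 5]
print(pearson_r(xs, ys))  # 0.8

# Shifting and rescaling one variable (positive scale) leaves r unchanged.
print(pearson_r([2 * x + 3 for x in xs], ys))
```

Note that a column that is constant has zero variance, so r is undefined for it; the sketch simply lets the division fail in that case.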
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
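A minimal sketch of this definition follows: rank both variables, then apply the Pearson formula to the ranks. The helper names (`average_ranks`, `spearman_rho`) and the choice to give tied values their average rank are assumptions of this example, not something the report specifies.

```python
import math

def average_ranks(values):
    """Assign 1-based ranks; tied values share the mean of the positions they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of positions i..j, converted to 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson_r(xs, ys):
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def spearman_rho(xs, ys):
    """Pearson correlation of the rank variables of X and Y."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

# A nonlinear but strictly increasing relationship: rho is 1, Pearson's r is not.
xs = [1, 2, 3, 4, 5]
ys = [x ** 3 for x in xs]
print(spearman_rho(xs, ys), pearson_r(xs, ys))
```

Because only the ordering matters, extreme values such as the long right tails seen in the count variables of this dataset influence ρ far less than they influence r.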
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
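The pair-counting definition above corresponds to the variant usually called tau-a. The sketch below implements exactly that definition; the function name `kendall_tau_a` is an assumption for the example, and tied pairs are simply counted as neither concordant nor discordant. Library implementations often use tie-adjusted variants such as tau-b, so their results can differ on data with many ties.

```python
from itertools import combinations

def kendall_tau_a(xs, ys):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    points = list(zip(xs, ys))
    n = len(points)
    if n < 2:
        raise ValueError("need at least 2 observations")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(points, 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:      # both variables move in the same direction
            concordant += 1
        elif direction < 0:    # the variables move in opposite directions
            discordant += 1
        # direction == 0 means a tie in x or y; counted as neither
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs

xs = [1, 2, 3, 4, 5]
ys = [2, 1, 4, 3, 5]
print(kendall_tau_a(xs, ys))  # 0.6  (8 concordant, 2 discordant, 10 pairs)
```

The loop visits every pair, so the cost grows quadratically with the number of observations; that is acceptable for a sketch but slow for large columns.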
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. Extensive documentation is available.
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use a bias-corrected measure proposed by Bergsma in 2013.
First rows
| | user_id | avg_revenue | nc | attr_os_str | devicebrand | cnt_call | cnt_dis | cnt_mobile | cnt_internet | cnt_tv | cnt_voice | cnt_add_ons |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 000fa182-b7ab-4856-9c5e-a63701385979 | 25.364000 | mk | ANDROID | samsung | 0.0 | 0.0 | 1.0 | 0.0 | 1.0 | 0.0 | 2.0 |
| 1 | 001b4889-c295-4ab5-a7fb-c636ece80890 | 2.000000 | at | IOS | Apple | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 2 | 001db037-2940-4e66-abab-135af3338eec | 18.241600 | hr | ANDROID | samsung | 0.0 | 0.0 | 2.0 | 0.0 | 0.0 | 0.0 | 1.0 |
| 3 | 002c732e-63bb-4d24-be14-51d6415e7b8a | 15.428400 | hr | ANDROID | samsung | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 4 | 002c968a-d3e6-428b-be16-36991bae1876 | 0.555556 | me | ANDROID | HONOR | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 1.0 |
| 5 | 003a2fea-9e23-48b7-a704-6f5c3ae2d6a4 | 35.491820 | hr | ANDROID | samsung | 0.0 | 0.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 |
| 6 | 00758022-9b70-441d-a317-d1c502b0f03a | 142.255100 | hr | ANDROID | samsung | 0.0 | 0.0 | 5.0 | 0.0 | 0.0 | 0.0 | 3.0 |
| 7 | 008582ed-314a-4abb-b9d5-c95c647eae11 | 33.215000 | me | ANDROID | samsung | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 8 | 0087ac6adda43eb8b5bf62b0e1bd9b8dac25030ea15344ea56ed9afb26f6dc2e | 5.345000 | sk | ANDROID | samsung | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 1.0 |
| 9 | 00968692-5460-4ea8-9a8e-a93000d0ebef | 71.436000 | mk | ANDROID | samsung | 0.0 | 0.0 | 3.0 | 0.0 | 0.0 | 0.0 | 3.0 |
Last rows
| | user_id | avg_revenue | nc | attr_os_str | devicebrand | cnt_call | cnt_dis | cnt_mobile | cnt_internet | cnt_tv | cnt_voice | cnt_add_ons |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 19671 | ff1a37bd-8a60-45db-8522-b5d02981f491 | 94.110250 | hr | ANDROID | HUAWEI | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 12.0 |
| 19672 | ff2a6ca3-efd4-4480-85a6-ec6de935e0d1 | 13.804227 | hr | ANDROID | htc | 0.0 | 0.0 | 2.0 | 1.0 | 1.0 | 1.0 | 3.0 |
| 19673 | ff3fa14e-5051-469a-9db2-a78601531f12 | 337.584000 | mk | ANDROID | samsung | 0.0 | 0.0 | 3.0 | 1.0 | 1.0 | 1.0 | 1.0 |
| 19674 | ff4637aa73f626ce2fa617ffad4b825a8f975f0d78cdb7914da23e800140da4a | 161.564000 | sk | ANDROID | samsung | 0.0 | 0.0 | 6.0 | 0.0 | 0.0 | 0.0 | 5.0 |
| 19675 | ff46d44c-ee1b-4f7d-bd95-333abac5ad41 | 88.450000 | me | ANDROID | samsung | 0.0 | 0.0 | 5.0 | 0.0 | 1.0 | 0.0 | 2.0 |
| 19676 | ff87dff2-f330-4f0e-b404-ab2c008578d9 | 3.200000 | mk | ANDROID | HUAWEI | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 2.0 |
| 19677 | ffa08ebc-e4bb-46ea-ae48-1cc0b5a1b612 | 46.659513 | hr | ANDROID | Xiaomi | 0.0 | 0.0 | 3.0 | 0.0 | 0.0 | 0.0 | 6.0 |
| 19678 | ffa1111d-1406-4bc5-ad2f-d57c119de744 | 108.836000 | hr | ANDROID | samsung | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 19679 | ffb25cab-d77b-48b9-8a1d-093a894f1d4d | 36.926500 | hr | ANDROID | samsung | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 3.0 |
| 19680 | ffbd7171-b306-4f19-9077-ac6300c3deb9 | 3.200000 | mk | ANDROID | HUAWEI | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 |
Most frequent
| | user_id | avg_revenue | nc | attr_os_str | devicebrand | cnt_call | cnt_dis | cnt_mobile | cnt_internet | cnt_tv | cnt_voice | cnt_add_ons | count |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 100487292 | 46.601818 | sk | ANDROID | Xiaomi | 0.0 | 0.0 | 2.0 | 0.0 | 0.0 | 0.0 | 6.0 | 2 |
| 1 | 2514811034 | 1.542975 | ro | ANDROID | samsung | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 2 |
| 2 | 2917446999 | 24.772364 | ro | ANDROID | samsung | 0.0 | 0.0 | 0.0 | 0.0 | 2.0 | 0.0 | 0.0 | 2 |
| 3 | 3103546688 | 61.518712 | ro | ANDROID | HUAWEI | 0.0 | 0.0 | 3.0 | 0.0 | 0.0 | 0.0 | 0.0 | 2 |
| 4 | TMCZ_6009584899 | 58.044620 | cz | ANDROID | HONOR | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 2 |
| 5 | TMCZ_9613733 | 7.695000 | cz | ANDROID | samsung | 0.0 | 0.0 | 4.0 | 0.0 | 1.0 | 0.0 | 0.0 | 2 |
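A table of repeated rows like the one above can be reproduced by counting how often each complete row occurs. The sketch below uses only the standard library; the function name `duplicated_rows` and the three-column miniature of the dataset are hypothetical and serve purely as illustration.

```python
from collections import Counter

def duplicated_rows(rows):
    """Return {row: count} for every row that occurs more than once."""
    counts = Counter(rows)
    return {row: n for row, n in counts.items() if n > 1}

# Hypothetical miniature of the dataset: (user_id, nc, devicebrand) tuples.
rows = [
    ("100487292", "sk", "Xiaomi"),
    ("2514811034", "ro", "samsung"),
    ("100487292", "sk", "Xiaomi"),
    ("TMCZ_9613733", "cz", "samsung"),
]

dupes = duplicated_rows(rows)
for row, count in dupes.items():
    print(row, count)
```

With the data in a pandas DataFrame `df`, `df[df.duplicated(keep=False)]` selects the same rows. In this report each of the six listed rows appears twice, which matches both the duplicate-row count of 6 and the gap between 19681 observations and 19675 distinct `user_id` values.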